Abstract

Base station (BS) cooperation can turn unwanted interference into useful signal energy, thereby enhancing
system performance. In the cooperative downlink, zero-forcing beamforming (ZFBF) with a simple
scheduler is well known to achieve nearly the performance of capacity-achieving dirty-paper coding.
However, the centralized ZFBF approach becomes prohibitively complex as the network size grows. In this
paper, we devise message passing algorithms for realizing the regularized ZFBF (RZFBF) in a distributed
manner using belief propagation. In the proposed methods, the overall computational cost is decomposed
into many smaller computation tasks carried out by groups of neighboring BSs, and communication is
only required between neighboring BSs. More importantly, some exchanged messages can be computed
based on channel statistics rather than instantaneous channel state information, leading to a significant
reduction in computational complexity. Simulation results demonstrate that the proposed algorithms
converge quickly to the exact RZFBF solution, and much faster than conventional methods.

Index Terms: Base station cooperation, Belief propagation, Distributed algorithm, Message passing,
Zero-forcing beamforming.



Institute of Communications Engineering, National Sun Yat-sen University, Taiwan. Email: chaokai.wen@mail.nsysu.edu.tw.
Department of Optoelectronics & Communication Engineering, National Kaohsiung Normal University, Kaohsiung, Taiwan.
Department of Electronic and Electrical Engineering, University College London, UK.
Industrial Technology Research Institute (ITRI), Hsinchu 310, Taiwan, R.O.C.






I. Introduction 



Multiuser multiple-input multiple-output (MU-MIMO) antenna systems have been recognized as an effective
means to increase capacity in the downlink [1-3]. However, MU-MIMO may not be as effective for
edge-of-cell users because of severe inter-cell interference that is hard to suppress. In recent years, it
has emerged that letting base stations (BSs) cooperate can greatly improve the link quality of
edge-of-cell users by turning unwanted interference into useful signal energy, e.g., [4-9] (and the references therein).
Ideally, by sharing all the required information via high-speed backhaul links, all BSs in a downlink cellular
network can become a super BS with distributed sets of antennas. This architecture then allows the use
of well-known optimal or suboptimal transmission strategies, such as capacity-achieving dirty-paper coding
(DPC) techniques [5,6,10] and zero-forcing beamforming (ZFBF) [7,11], respectively.

Although DPC is capacity-achieving, it is very complex, and considerable interest has focused on employing ZFBF
with a simple scheduler to approach near-capacity performance [7, 11]. For example, several testbeds for
implementing BS cooperation have adopted ZFBF techniques, e.g., [12-15]. Regularized ZFBF (RZFBF)
generalizes ZFBF by introducing a regularization parameter [16, 17]. It has been shown that
several beamformers have an RZFBF structure when the regularization parameter is selected properly [18].
Even though information-theoretic studies have provided overwhelming support for RZFBF [18-20], the
real question is how RZFBF can be implemented in a very large-scale cellular network.

A straightforward way to implement RZFBF would be to use a central processing unit
which possesses all the necessary channel state information (CSI) and performs the entire optimization.
However, as a network expands with more BSs cooperating, it becomes infeasible to perform joint processing
over all BSs because of the limited backhaul capacity and the excessive computational complexity. It is
therefore of greater interest to consider an architecture where BSs only communicate with neighboring
BSs and the overall computation cost is decomposed into many smaller computational tasks, amortized over
smaller groups of cooperating BSs. Motivated by this, in this paper, we propose two message
passing algorithms to realize RZFBF in a distributed manner. The proposed approaches are particularly
well suited to cooperation of large clusters of simple and loosely connected BSs. Most importantly, in our
designs, each BS is only required to know the data symbols of users within its reception range rather than those of the
entire cellular network, greatly reducing the backhaul requirements.

The use of distributed methods in beamforming computations has been studied recently in [21-26]. 
Our approach is similar to [21] in that both aim at achieving RZFBF and use belief propagation (BP). 
Nonetheless, the two approaches differ considerably. Our main contributions are summarized as follows: 
• First, we generalize the earlier results in [21] to incorporate multiple antennas at both BSs and user
equipments (UEs), so that our results can be applied to a wide range of scenarios with complex-valued
systems. Further, we adopt the approximate message passing (AMP) method in [27] to significantly
reduce the number of exchanged messages. The proposed AMP-RZFBF has the advantage that
every communication of a BS with its neighbors takes place in a broadcast fashion, as opposed to
the unicast manner in [21]. The AMP method has recently received considerable interest in the
field of compressed sensing [27-30]. Our form of the message passing algorithm is closely related to
the AMP methods in [29], which are a special case of the generalized AMP [30].

• In AMP-RZFBF, BSs must compute several matrix inversions for every channel realization and then
exchange these auxiliary parameters among themselves, requiring very high computational capability
and rapid information exchange between the BSs. To tackle this, we approximate some of the auxiliary
parameters by exploiting the spatial channel covariance information (CCoI). The CCoI-aided
AMP-RZFBF results in significantly simpler implementations in terms of computation and communication.
With the CCoI-aided AMP-RZFBF, the BSs compute and exchange the auxiliary parameters only at the
time scale at which the CCoI changes, rather than at that of the instantaneous CSI. Simulation results show
that CCoI-aided AMP-RZFBF achieves promising results, in contrast to earlier results
based on the CCoI, e.g., [31,32], where a performance degradation is usually expected.

• Implementing RZFBF in a distributed manner can also be achieved by an optimization technique called
the alternating direction method of multipliers (ADMM) [33]. Applications of ADMM to
the beamforming problem concerned can be found in [34] (or [33, Section 8.3]). However, it is known
that ADMM can be very slow to converge. Simulation results will demonstrate that our proposed
message passing algorithms exhibit a much faster convergence rate than ADMM.

Notations: Throughout this paper, the complex number field is denoted by $\mathbb{C}$. For any matrix $\mathbf{A} \in \mathbb{C}^{M \times N}$,
$[\mathbf{A}]_{ij}$ denotes the $(i,j)$th entry, while $\mathbf{A}^T$ and $\mathbf{A}^H$ return the transpose and the conjugate transpose
of $\mathbf{A}$, respectively. For a square matrix $\mathbf{B}$, $\mathbf{B}^{1/2}$, $\mathbf{B}^{-1}$, $\mathrm{tr}(\mathbf{B})$, and $\det(\mathbf{B})$ denote the principal square root,
inverse, trace, and determinant of $\mathbf{B}$, respectively. In addition, $\mathbf{I}_N$ is the $N \times N$ identity matrix, $\mathbf{0}_N$ denotes
either an $N \times N$ zero matrix or a zero vector depending on the context, and $\mathbf{e}_i$ denotes the column vector
with the $i$th element being 1 and 0 elsewhere. Finally, $\|\cdot\|_2$ represents the Euclidean norm of an input
vector, and $\mathsf{E}\{\cdot\}$ returns the expectation of an input random entity.
Figure 1: A downlink model with BS cooperation.



II. System Model and Problem Formulation 



As shown in Figure 1, we consider a large-scale MIMO broadcast system where $L$ interconnected multi-antenna
BSs, labeled BS$_1$, ..., BS$_L$, simultaneously send information to $K$ users, labeled UE$_1$, ..., UE$_K$.
In the system, UE$_k$ is equipped with $M_k$ antennas while BS$_l$ is equipped with $N_l$ antennas. Let $M = \sum_{k=1}^{K} M_k$
and $N = \sum_{l=1}^{L} N_l$. The received signals at all the UEs can be expressed in vector form as
$\mathbf{y} = [\mathbf{y}_1^T, \ldots, \mathbf{y}_K^T]^T$, which is modeled as

$$\mathbf{y} = \begin{bmatrix} \mathbf{H}_{1,1} & \cdots & \mathbf{H}_{1,L} \\ \vdots & \ddots & \vdots \\ \mathbf{H}_{K,1} & \cdots & \mathbf{H}_{K,L} \end{bmatrix} \begin{bmatrix} \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_L \end{bmatrix} + \mathbf{z} = \mathbf{H}\mathbf{x} + \mathbf{z}, \qquad (1)$$

where $\mathbf{x}_l \in \mathbb{C}^{N_l}$ denotes the transmitted signal from BS $l$, $\mathbf{H}_{k,l} \in \mathbb{C}^{M_k \times N_l}$ represents the channel matrix from BS
$l$ to UE $k$, and $\mathbf{z}$ is the complex Gaussian noise vector with zero mean and covariance matrix $\sigma^2 \mathbf{I}_M$. In
(1), we have defined $\mathbf{x} \in \mathbb{C}^{N}$ as the vector of the transmitted signal and $\mathbf{H} \in \mathbb{C}^{M \times N}$ as the overall downlink
channel matrix. Although (1) looks like an $M \times N$ MIMO system, it is fundamentally different
from a point-to-point MIMO channel. To see the differences, we emphasize the following two features.

First, note that $\mathbf{H}$ may have many zero block matrices because one UE is only able to receive signals
from local BSs. This characteristic can be easily described via a graphical model as shown in Figure 1. For
ease of expression, let $\mathcal{U}_l \subseteq \{1, 2, \ldots, K\}$ comprise the set of user indices such that BS$_l$ has some influence
on these UEs, i.e., $\mathbf{H}_{i,l} \neq \mathbf{0}$ for $i \in \mathcal{U}_l$. Similarly, let $\mathcal{B}_k \subseteq \{1, 2, \ldots, L\}$ be the set of BS indices such that
these BSs have some influence on UE$_k$, i.e., $\mathbf{H}_{k,j} \neq \mathbf{0}$ for $j \in \mathcal{B}_k$. The local coupling is an important feature:
it means that communication should only be required between a subset of BSs rather than among all BSs.
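The neighbor sets can be read directly off the block structure of the channel matrix. The following is a minimal sketch, assuming the channel is stored as a nested list of blocks; the function name `neighbor_sets`, the 0-based indexing, and the three-BS example are illustrative choices, not part of the original text.

```python
import numpy as np

def neighbor_sets(H_blocks, tol=0.0):
    """Return (U, B): U[l] holds the UEs coupled to BS l, B[k] the BSs coupled to UE k."""
    K, L = len(H_blocks), len(H_blocks[0])
    U = [set() for _ in range(L)]
    B = [set() for _ in range(K)]
    for k in range(K):
        for l in range(L):
            # A block counts as a link only if it is not (numerically) zero.
            if np.linalg.norm(H_blocks[k][l]) > tol:
                U[l].add(k)
                B[k].add(l)
    return U, B

# Hypothetical example: 2 UEs and 3 BSs; UE 0 hears BS 0 and BS 1, UE 1 hears BS 1 and BS 2.
one, zero = np.ones((2, 3)), np.zeros((2, 3))
H_blocks = [[one, one, zero],
            [zero, one, one]]
U, B = neighbor_sets(H_blocks)
```

In this example only BS 1 serves both UEs, so it is the only BS that has to exchange information with both of its neighbors.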

Secondly, since both the BSs and the UEs are equipped with multiple antennas, the spatial correlation of the
MIMO channel for each link between a BS and a UE should be considered. In this paper, we employ
the Kronecker model to characterize the spatial correlation of the MIMO channel for each link, so that
the correlation at a BS and that at a UE are modeled separately [35]. Specifically, the channel from BS$_l$ to UE$_k$,
$\mathbf{H}_{k,l} \in \mathbb{C}^{M_k \times N_l}$, can be written as

$$\mathbf{H}_{k,l} = \mathbf{R}_{k,l}^{1/2} \mathbf{W}_{k,l} \mathbf{T}_{k,l}^{1/2}, \qquad (2)$$

where $\mathbf{R}_{k,l} \in \mathbb{C}^{M_k \times M_k}$ and $\mathbf{T}_{k,l} \in \mathbb{C}^{N_l \times N_l}$ are deterministic nonnegative definite matrices, which characterize
the spatial correlation of the received signals across the antenna elements of UE$_k$ and that of the transmitted
signals across the antenna elements of BS$_l$, respectively, and $\mathbf{W}_{k,l} = [\frac{1}{\sqrt{N_l}} W_{ij}^{(k,l)}] \in \mathbb{C}^{M_k \times N_l}$ consists
of the random components of the channel, in which the elements $\{W_{ij}^{(k,l)}\}_{1 \le i \le M_k;\, 1 \le j \le N_l}$ are i.i.d. complex
Gaussian random variables with zero mean and unit variance. To obtain a proper definition of the channel
gain of each link pair, we consider the power of the channel

$$\mathsf{E}\{\mathrm{tr}(\mathbf{H}_{k,l}\mathbf{H}_{k,l}^H)\} = \frac{1}{N_l}\mathrm{tr}(\mathbf{R}_{k,l})\,\mathrm{tr}(\mathbf{T}_{k,l}). \qquad (3)$$

If we assume that $\mathbf{R}_{k,l}$ and $\mathbf{T}_{k,l}$ are normalized such that $\mathrm{tr}(\mathbf{R}_{k,l}) = g_{k,l} N_l$ and $\mathrm{tr}(\mathbf{T}_{k,l}) = M_k$, then $g_{k,l}$
can be used as an indicator for the link gain between BS$_l$ and UE$_k$.$^1$
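A channel block following the Kronecker model (2) can be generated in a few lines. The sketch below is illustrative only: the helper names (`exp_corr`, `psd_sqrt`, `kronecker_channel`), the exponential correlation profile, and all numerical values are assumptions made for the example. It also checks the average channel power against (3) by Monte Carlo averaging.

```python
import numpy as np

def exp_corr(n, rho):
    """Hypothetical correlation matrix with entries rho**|i-j| (an assumed profile)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def psd_sqrt(A):
    """Principal square root of a Hermitian nonnegative definite matrix."""
    w, U = np.linalg.eigh(A)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def kronecker_channel(R, T, rng):
    """Draw H = R^{1/2} W T^{1/2}, with i.i.d. CN(0, 1/N_l) entries in W as in (2)."""
    M, N = R.shape[0], T.shape[0]
    W = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * N)
    return psd_sqrt(R) @ W @ psd_sqrt(T)

rng = np.random.default_rng(0)
M_k, N_l = 2, 4
R, T = exp_corr(M_k, 0.3), exp_corr(N_l, 0.6)
H = kronecker_channel(R, T, rng)

# Monte Carlo estimate of E{tr(H H^H)} versus the closed form in (3).
avg_power = np.mean([np.linalg.norm(kronecker_channel(R, T, rng)) ** 2 for _ in range(4000)])
expected_power = np.trace(R) * np.trace(T) / N_l
```

The two power values should agree up to Monte Carlo error.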

In the broadcast system (1), linear precoding, referred to as RZFBF, is used to project the data symbols
onto a subspace using the $N$ transmit antennas. Let $\mathbf{s} = [\mathbf{s}_1^T, \ldots, \mathbf{s}_K^T]^T$ be the vector of data symbols, where
$\mathbf{s}_k$ corresponds to the data symbols intended for UE$_k$. In RZFBF, the signal vector transmitted by the
BSs, denoted by $\mathbf{x}$, is given by [16, 17]

$$\mathbf{x} = \alpha \mathbf{H}^H (\mathbf{H}\mathbf{H}^H + \beta \mathbf{I}_M)^{-1} \mathbf{s}, \qquad (4)$$

where $\alpha$ is the normalization parameter that ensures the transmit power constraint is met, i.e., $\mathsf{E}\{\|\mathbf{x}_l\|_2^2\} \le P_l$
for $l = 1, \ldots, L$, with $P_l$ the power budget of BS$_l$. Note that RZFBF can be regarded as a generalization of other beamformers by adjusting
the regularization parameter $\beta$. For instance, if $\beta = 0$, it reduces to ZFBF, whereas if $\beta \to \infty$, it gives
matched-filter beamforming. Several other beamformers also have an RZFBF structure when
the regularization parameter is designed appropriately [18]. However, to obtain (4), all the BSs must cooperate to
jointly process the data symbols of all the users in the network, which requires global CSI. If $N$ and $M$
are very large (which they should be in order to benefit from the gains of MU-MIMO and BS cooperation),
the centralized approach becomes prohibitively complex. Therefore, in the next section, we propose
message passing algorithms that can realize RZFBF in a distributed manner.

$^1$ Indeed, the link gain can be included in either $\mathbf{R}_{k,l}$ or $\mathbf{T}_{k,l}$.
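For reference, the centralized beamformer (4) and its two limiting cases can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `rzfbf`, the default `alpha=1.0` (no power normalization), and the random test channel are assumptions.

```python
import numpy as np

def rzfbf(H, s, beta, alpha=1.0):
    """Centralized RZFBF of (4): x = alpha * H^H (H H^H + beta I_M)^{-1} s."""
    M = H.shape[0]
    return alpha * H.conj().T @ np.linalg.solve(H @ H.conj().T + beta * np.eye(M), s)

rng = np.random.default_rng(0)
M, N = 4, 8
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * N)
s = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)

x_zf = rzfbf(H, s, beta=0.0)          # beta = 0: zero forcing, so H x = s
x_reg = rzfbf(H, s, beta=0.1)         # beta > 0: less transmit energy, some residual error
x_mf = rzfbf(H, s, beta=1e8) * 1e8    # large beta: proportional to the matched filter H^H s
```

The solve step involves the full $M \times M$ matrix $\mathbf{H}\mathbf{H}^H + \beta\mathbf{I}_M$, which is exactly the global operation that the distributed algorithms aim to avoid.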



III. A Bayesian Approach to Distributed RZFBF 

We are now concerned with the problem of distributing the computation of (4) among the BSs. Toward
this end, we first use the virtual model concept of [21], which recasts the RZFBF optimization problem
as an estimation problem, described as follows.
Consider the virtual model

$$\mathbf{s} = \mathbf{H}\mathbf{x} + \mathbf{z}, \qquad (5)$$

where $\mathbf{z} \in \mathbb{C}^{M}$ is a Gaussian random vector with zero mean and covariance matrix $\beta \mathbf{I}_M$. Recall that
$\mathbf{s}$ is the data symbol vector for all the users and $\mathbf{x}$ is the signal transmitted by the BSs. The virtual model
implies that the transmitted signal $\mathbf{x}$ goes through the channel $\mathbf{H}$ and is then observed as $\mathbf{s}$, with additive noise $\mathbf{z}$, at the
UE sides. What is important here is that the virtual model allows us to treat the beamforming problem
through a probabilistic inference approach. Specifically, we adopt a Bayesian approach.

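The Bayes optimal way of estimating $\mathbf{x}$ that minimizes the mean square error is given by [36]

$$\hat{\mathbf{x}} = \int \mathbf{x}\, p(\mathbf{x}|\mathbf{s})\, d\mathbf{x}, \qquad (6)$$

where $p(\mathbf{x}|\mathbf{s})$ is the posterior probability of $\mathbf{x}$ given the observation $\mathbf{s}$. Following Bayes' theorem, we have

$$p(\mathbf{x}|\mathbf{s}) = \frac{p(\mathbf{s}|\mathbf{x})\, p(\mathbf{x})}{\int p(\mathbf{s}|\mathbf{x})\, p(\mathbf{x})\, d\mathbf{x}}, \qquad (7)$$

where the conditional distribution of $\mathbf{s}$ given $\mathbf{x}$ under (5) is given by

$$p(\mathbf{s}|\mathbf{x}) = \frac{1}{(\pi\beta)^{M}}\, e^{-\frac{1}{\beta}\sum_{k=1}^{K}\left\|\mathbf{s}_k - \sum_{l\in\mathcal{B}_k}\mathbf{H}_{k,l}\mathbf{x}_l\right\|_2^2}. \qquad (8)$$

If we assume that $\mathbf{x}$ is a standard complex Gaussian random vector with density
$p(\mathbf{x}) = \frac{1}{\pi^{N}} e^{-\sum_{l=1}^{L}\|\mathbf{x}_l\|_2^2}$, then the posterior distribution $p(\mathbf{x}|\mathbf{s})$ admits the explicit expression

$$p(\mathbf{x}|\mathbf{s}) = \frac{1}{Z}\, e^{-\frac{1}{\beta}\sum_{k=1}^{K}\left\|\mathbf{s}_k - \sum_{l\in\mathcal{B}_k}\mathbf{H}_{k,l}\mathbf{x}_l\right\|_2^2 - \sum_{l=1}^{L}\|\mathbf{x}_l\|_2^2}. \qquad (9)$$

Henceforth, we shall use $Z$ to denote a universal normalization factor whose value may vary from one
appearance to another. Plugging (9) into (6) and applying the Gaussian integral (Lemma 1 in Appendix
B), one finds that the solution of (6) is identical to (4) without the power normalization
parameter. Next, we shall use belief propagation (BP) for computing (6).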
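The step from (9) to (4) can be made explicit by completing the square. The following derivation is a sketch that uses only (6), (9), and Lemma 1; the shorthand $\mathbf{A}$ and $\mathbf{b}$ is introduced here for illustration, and the last equality is the standard push-through identity $(\mathbf{H}^H\mathbf{H} + \beta\mathbf{I}_N)^{-1}\mathbf{H}^H = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H + \beta\mathbf{I}_M)^{-1}$.

```latex
\begin{align*}
p(\mathbf{x}|\mathbf{s})
  &\propto \exp\!\Big(-\tfrac{1}{\beta}\|\mathbf{s}-\mathbf{H}\mathbf{x}\|_2^2 - \|\mathbf{x}\|_2^2\Big)
   \propto \exp\!\big(-\mathbf{x}^H\mathbf{A}\mathbf{x} + \mathbf{b}^H\mathbf{x} + \mathbf{x}^H\mathbf{b}\big), \\
\mathbf{A} &= \tfrac{1}{\beta}\mathbf{H}^H\mathbf{H} + \mathbf{I}_N,
  \qquad \mathbf{b} = \tfrac{1}{\beta}\mathbf{H}^H\mathbf{s}, \\
\hat{\mathbf{x}} &= \mathbf{A}^{-1}\mathbf{b}
   = (\mathbf{H}^H\mathbf{H} + \beta\mathbf{I}_N)^{-1}\mathbf{H}^H\mathbf{s}
   = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H + \beta\mathbf{I}_M)^{-1}\mathbf{s}.
\end{align*}
```

The result coincides with (4) for $\alpha = 1$, which is why the posterior mean under the virtual model can stand in for the RZFBF solution.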



A. BP-RZFBF 

We begin by applying the standard BP algorithm [37] to compute (6). BP can be
regarded as a graphical method for estimating the marginal distributions of $p(\mathbf{x}|\mathbf{s})$ with
respect to the variables $\mathbf{x}_l$. To this end, we reformulate the problem on a bipartite graph called the factor
graph. The corresponding factor graph is depicted in Figure 1, where a circle represents a variable node
associated with the transmit beamforming vector, i.e., $\mathbf{x}_l$ for BS$_l$, whereas a square indicates a factor node
associated with the sub-constraint function, i.e., $\|\mathbf{s}_k - \sum_{l\in\mathcal{B}_k}\mathbf{H}_{k,l}\mathbf{x}_l\|_2^2$ for UE$_k$. There is an edge between
a variable node $l$ and a factor node $k$ if and only if $\mathbf{H}_{k,l} \neq \mathbf{0}$.

To estimate the marginal distributions, BP iterates a set of message passing equations that go from
factor nodes to variable nodes (i.e., $k \to l$) and from variable nodes to factor nodes (i.e., $l \to k$), as
illustrated in Figure 1. The message $q_{k\to l}$ from the factor node $k$ to the variable node $l$ is the marginal
probability of the variable $\mathbf{x}_l$ when only the sub-constraint $k$ is present. On the other hand, the message
$q_{l\to k}$ from the variable node $l$ to the factor node $k$ is the marginal probability of the variable $\mathbf{x}_l$ in the
absence of the sub-constraint $k$.

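Specifically, in order to estimate the marginal distributions $p(\mathbf{x}_l|\mathbf{s})$ with the BP algorithm, $2KL$ messages
for the probability distributions of the variables $\mathbf{x}_l$ are constructed in the following way [37]:

$$q^{(t)}_{k\to l}(\mathbf{x}_l) = \frac{1}{Z^{(t)}_{k\to l}} \int \prod_{j\in\mathcal{B}_k\setminus l} d\mathbf{x}_j\, q^{(t-1)}_{j\to k}(\mathbf{x}_j)\; e^{-\frac{1}{\beta}\left\|\mathbf{s}_k - \sum_{j\in\mathcal{B}_k\setminus l}\mathbf{H}_{k,j}\mathbf{x}_j - \mathbf{H}_{k,l}\mathbf{x}_l\right\|_2^2}, \qquad (10a)$$

$$q^{(t)}_{l\to k}(\mathbf{x}_l) = \frac{1}{Z^{(t)}_{l\to k}}\, p(\mathbf{x}_l) \prod_{i\in\mathcal{U}_l\setminus k} q^{(t)}_{i\to l}(\mathbf{x}_l), \qquad (10b)$$

where $t = 1, 2, \ldots$ represents the iteration index and $Z^{(t)}_{k\to l}$ and $Z^{(t)}_{l\to k}$ are the normalization factors ensuring
that $\int d\mathbf{x}_l\, q^{(t)}_{k\to l}(\mathbf{x}_l) = \int d\mathbf{x}_l\, q^{(t)}_{l\to k}(\mathbf{x}_l) = 1$. At the termination of the message passing algorithm, say at
iteration $T$, the final estimate of $\mathbf{x}_l$ is given by $\hat{\mathbf{x}}_l = \int \mathbf{x}_l\, q^{(T)}_l(\mathbf{x}_l)\, d\mathbf{x}_l$, where $q^{(T)}_l(\mathbf{x}_l) \propto p(\mathbf{x}_l)\prod_{k=1}^{K} q^{(T)}_{k\to l}(\mathbf{x}_l)$. The
RZFBF solution can thus be realized in a distributed manner via the message passing procedures. However,
the messages are density functions, which are usually too complex to be exchanged and would impose a huge
burden on the backhaul in our application of BS cooperation.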

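To overcome this, the messages can be approximated as Gaussian and parameterized by their mean and
covariance. Instead of passing the density functions, we thus have the mean and covariance as the messages:

$$\mathbf{x}^{(t)}_{l\to k} = \langle \mathbf{x}_l \rangle_{q^{(t)}_{l\to k}}, \qquad (11)$$
$$\mathbf{V}^{(t)}_{l\to k} = \big\langle (\mathbf{x}_l - \mathbf{x}^{(t)}_{l\to k})(\mathbf{x}_l - \mathbf{x}^{(t)}_{l\to k})^H \big\rangle_{q^{(t)}_{l\to k}}, \qquad (12)$$

where $\langle f(\mathbf{x}_l) \rangle_{q_{l\to k}}$ denotes the average or expectation of a function $f(\mathbf{x}_l)$ over the random vector $\mathbf{x}_l$ with
distribution $q_{l\to k}(\mathbf{x}_l)$. Mathematically, that is

$$\langle f(\mathbf{x}_l) \rangle_{q_{l\to k}} = \int f(\mathbf{x}_l)\, q_{l\to k}(\mathbf{x}_l)\, d\mathbf{x}_l.$$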



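The Gaussian approximation method was introduced in [30, 38] for the case where the message is scalar and the
matrix concerned is sparse. In our case, we follow the techniques in [29] by considering that the block matrix
$\mathbf{H}_{k,l}$ scales as $O(1/\sqrt{N_l})$. As a consequence, we can approximate $q^{(t)}_{k\to l}(\mathbf{x}_l)$ by

$$\tilde{q}^{(t)}_{k\to l}(\mathbf{x}_l) = \frac{1}{\tilde{Z}^{(t)}_{k\to l}}\, e^{-\left(\mathbf{x}_l^H \mathbf{E}^{(t)}_{k\to l}\mathbf{x}_l - (\mathbf{F}^{(t)}_{k\to l})^H\mathbf{x}_l - \mathbf{x}_l^H\mathbf{F}^{(t)}_{k\to l}\right)}, \qquad (13)$$

where

$$\mathbf{E}^{(t)}_{k\to l} = \mathbf{H}_{k,l}^H \Big( \sum_{j\in\mathcal{B}_k\setminus l} \mathbf{H}_{k,j}\mathbf{V}^{(t-1)}_{j\to k}\mathbf{H}_{k,j}^H + \beta\mathbf{I}_{M_k} \Big)^{-1} \mathbf{H}_{k,l}, \qquad (14)$$

$$\mathbf{F}^{(t)}_{k\to l} = \mathbf{H}_{k,l}^H \Big( \sum_{j\in\mathcal{B}_k\setminus l} \mathbf{H}_{k,j}\mathbf{V}^{(t-1)}_{j\to k}\mathbf{H}_{k,j}^H + \beta\mathbf{I}_{M_k} \Big)^{-1} \Big( \mathbf{s}_k - \sum_{j\in\mathcal{B}_k\setminus l} \mathbf{H}_{k,j}\mathbf{x}^{(t-1)}_{j\to k} \Big). \qquad (15)$$

Notice that $\mathbf{x}_{j\to k}$ and $\mathbf{V}_{j\to k}$ in (14)-(15) are functions of $q_{j\to k}$, which is also altered due to the approximation
$\tilde{q}_{k\to l}$. To make the connection, from (13) and (10b), we have

$$\tilde{q}^{(t)}_{l\to k}(\mathbf{x}_l) = \frac{1}{\tilde{Z}^{(t)}_{l\to k}}\, p(\mathbf{x}_l)\, e^{-\left(\mathbf{x}_l^H \mathbf{E}^{(t)}_{l\setminus k}\mathbf{x}_l - (\mathbf{F}^{(t)}_{l\setminus k})^H\mathbf{x}_l - \mathbf{x}_l^H\mathbf{F}^{(t)}_{l\setminus k}\right)}, \qquad (16)$$

where $\mathbf{E}^{(t)}_{l\setminus k} = \sum_{i\in\mathcal{U}_l\setminus k}\mathbf{E}^{(t)}_{i\to l}$ and $\mathbf{F}^{(t)}_{l\setminus k} = \sum_{i\in\mathcal{U}_l\setminus k}\mathbf{F}^{(t)}_{i\to l}$.

Henceforth, we replace $\mathbf{x}_{l\to k}$ and $\mathbf{V}_{l\to k}$ in (14)-(15) by $\tilde{\mathbf{x}}_{l\to k}$ and $\tilde{\mathbf{V}}_{l\to k}$, which are, respectively, the
mean and covariance over the probability distribution $\tilde{q}_{l\to k}$. Also, in the sequel, we no longer use the
probability distribution $q_{l\to k}$. However, for notational convenience, we abuse our notation slightly and
still use $\mathbf{x}_{l\to k}$ and $\mathbf{V}_{l\to k}$ to denote the mean and covariance over the probability distribution $\tilde{q}_{l\to k}$.

Recall that $\mathbf{x}_l$ is a standard complex Gaussian random vector. By applying the Gaussian
integral (Lemma 1 in Appendix B), $\mathbf{x}_{l\to k}$ and $\mathbf{V}_{l\to k}$ with the distribution $\tilde{q}_{l\to k}(\mathbf{x}_l)$ in (16) can be computed
analytically. This leads to the following closed form of the BP update:

$$\mathbf{x}^{(t)}_{l\to k} = (\mathbf{E}^{(t)}_{l\setminus k} + \mathbf{I}_{N_l})^{-1}\mathbf{F}^{(t)}_{l\setminus k}, \qquad (17a)$$
$$\mathbf{V}^{(t)}_{l\to k} = (\mathbf{E}^{(t)}_{l\setminus k} + \mathbf{I}_{N_l})^{-1}. \qquad (17b)$$

The number of messages is still $2KL$. However, the
message update here is only on the mean and covariance rather than the functional update in (10). At the
termination of the BP, the final estimate of $\mathbf{x}_l$ is given by

$$\hat{\mathbf{x}}_l = (\mathbf{E}^{(T)}_{l} + \mathbf{I}_{N_l})^{-1}\mathbf{F}^{(T)}_{l}, \qquad (18)$$

where $\mathbf{E}^{(t)}_{l} = \sum_{k\in\mathcal{U}_l}\mathbf{E}^{(t)}_{k\to l}$ and $\mathbf{F}^{(t)}_{l} = \sum_{k\in\mathcal{U}_l}\mathbf{F}^{(t)}_{k\to l}$. We refer to this algorithm as BP-RZFBF although a
variant of BP is adopted here. In some applications, the regularization parameter $\beta$ varies for different
UEs. In these cases, we only have to replace $\beta$ with $\beta_k$ in (14)-(15).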

BP-RZFBF is a generalization of [21, (27)-(28)], in which the $\mathbf{H}_{k,l}$'s are scalars (real numbers). Clearly,
this generalization can be applied to a wide range of scenarios with complex-valued systems. Additionally,
it performs block matrix computations, resulting in a natural partition of BSs.
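The Gaussian message updates (14)-(15), (17), and (18) can be sketched compactly in NumPy. The code below is an illustrative reading of those equations, not the authors' implementation: the function name `bp_rzfbf`, the nested-list storage of blocks, the prior-based initialization, and the tiny chain-shaped test network are all assumptions. Zero blocks are simply looped over, since they contribute nothing to the sums. The test network is a tree (UE 0 hears BS 0 and BS 1, UE 1 hears BS 1 and BS 2), a case in which Gaussian BP is exact after a few iterations, so the result can be compared with the centralized solution (4) for $\alpha = 1$.

```python
import numpy as np

def bp_rzfbf(H, s, beta, iters=5):
    """Gaussian BP sketch. H[k][l] is the (M_k x N_l) block, s[k] the symbols of UE k."""
    K, L = len(H), len(H[0])
    N = [H[0][l].shape[1] for l in range(L)]
    I = [np.eye(n, dtype=complex) for n in N]
    # Variable-to-factor messages (mean x[l][k], covariance V[l][k]), started at the prior.
    x = [[np.zeros(N[l], dtype=complex) for _ in range(K)] for l in range(L)]
    V = [[I[l].copy() for _ in range(K)] for l in range(L)]
    for _ in range(iters):
        # Factor-to-variable messages, cf. (14)-(15).
        E = [[None] * L for _ in range(K)]
        F = [[None] * L for _ in range(K)]
        for k in range(K):
            for l in range(L):
                C = beta * np.eye(len(s[k]), dtype=complex)
                r = np.array(s[k], dtype=complex)
                for j in range(L):
                    if j != l:
                        C = C + H[k][j] @ V[j][k] @ H[k][j].conj().T
                        r = r - H[k][j] @ x[j][k]
                G = H[k][l].conj().T @ np.linalg.inv(C)
                E[k][l], F[k][l] = G @ H[k][l], G @ r
        # Variable-to-factor messages, cf. (17): leave out the destination factor k.
        for l in range(L):
            for k in range(K):
                E_ex = sum((E[i][l] for i in range(K) if i != k), 0 * I[l])
                F_ex = sum((F[i][l] for i in range(K) if i != k), np.zeros(N[l], dtype=complex))
                V[l][k] = np.linalg.inv(E_ex + I[l])
                x[l][k] = V[l][k] @ F_ex
    # Final estimate, cf. (18): combine all incoming messages with the prior.
    return [np.linalg.inv(sum((E[k][l] for k in range(K)), I[l].copy()))
            @ sum((F[k][l] for k in range(K)), np.zeros(N[l], dtype=complex))
            for l in range(L)]

rng = np.random.default_rng(1)

def cgauss(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

M_k, N_l, beta = 2, 3, 0.1
mask = [[1, 1, 0], [0, 1, 1]]  # hypothetical chain topology (a tree)
H = [[mask[k][l] * cgauss(M_k, N_l) / np.sqrt(N_l) for l in range(3)] for k in range(2)]
s = [cgauss(M_k) for _ in range(2)]

x_bp = np.concatenate(bp_rzfbf(H, s, beta))
H_full, s_full = np.block(H), np.concatenate(s)
x_ref = H_full.conj().T @ np.linalg.solve(H_full @ H_full.conj().T + beta * np.eye(4), s_full)
```

On graphs with loops the same updates are only approximate per iteration, and convergence is not guaranteed by this sketch.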

B. AMP-RZFBF 

In BP-RZFBF, each BS has to send separate messages with respect to $k$, i.e., $\mathbf{x}^{(t)}_{l\to k}$ and $\mathbf{V}^{(t)}_{l\to k}$ for all $k$. We
can reduce the messaging overhead to $2(K + L)$. To do so, we note that the messages $\mathbf{x}^{(t)}_{l\to k}$ and $\mathbf{V}^{(t)}_{l\to k}$ are
functions of $\mathbf{E}^{(t)}_{l\setminus k}$ and $\mathbf{F}^{(t)}_{l\setminus k}$, which are nearly independent of $k$. However, one must keep all the correction
terms that are linear in $\mathbf{H}_{k,l}$. This methodology was first introduced in compressed sensing applications
in [27] and is referred to as AMP. Applying AMP to the BP-RZFBF problem, we obtain the
AMP-RZFBF algorithm in Algorithm 1. For readability, the detailed derivation is given in Appendix A.

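Now, we turn our attention to realizing AMP-RZFBF for the cooperative system. In general, each
iteration requires a broadcast and gathering operation. We assume that each BS has local data information
and CSI; e.g., only $\{\mathbf{s}_k, \mathbf{H}_{k,l}\}$ for $k \in \mathcal{U}_l$ are known at BS$_l$. The first two steps of AMP-RZFBF consist of
performing the $\boldsymbol{\Omega}^{(t)}_k$ and $\boldsymbol{\nu}^{(t)}_k$ updates at BS$_l$. Notice that for BS$_l$, the $\boldsymbol{\Omega}^{(t)}_k$ and $\boldsymbol{\nu}^{(t)}_k$ updates are only for indices
$k \in \mathcal{U}_l$, which correspond to the user indices within its reception range. In order to update $\boldsymbol{\Omega}^{(t)}_k$ and $\boldsymbol{\nu}^{(t)}_k$,
BS$_l$ must gather $\mathbf{H}_{k,j}\mathbf{V}^{(t-1)}_j\mathbf{H}_{k,j}^H$ and $\mathbf{H}_{k,j}\mathbf{x}^{(t-1)}_j$ from the set of its neighboring BSs $j \in \mathcal{B}_k$. After getting $\boldsymbol{\Omega}^{(t)}_k$
and $\boldsymbol{\nu}^{(t)}_k$, BS$_l$ is able to compute $(\mathbf{E}^{(t)}_l, \mathbf{F}^{(t)}_l)$ and then update $(\mathbf{x}^{(t)}_l, \mathbf{V}^{(t)}_l)$ subsequently. Once $(\mathbf{x}^{(t)}_l, \mathbf{V}^{(t)}_l)$ are
computed, BS$_l$ will broadcast $\mathbf{H}_{k,l}\mathbf{V}^{(t)}_l\mathbf{H}_{k,l}^H$ and $\mathbf{H}_{k,l}\mathbf{x}^{(t)}_l$ to its neighboring BSs. The algorithm continues
to repeat the procedures above until it reaches a predefined number of iterations.

Algorithm 1: AMP-RZFBF

Input: Data symbols $\mathbf{s}_k$ for $k = 1, \ldots, K$, channel matrices $\mathbf{H}_{k,l}$ for $k = 1, \ldots, K$ and $l = 1, \ldots, L$.
Output: Return the RZFBF $\mathbf{x}_l$ for $l = 1, \ldots, L$.
1 begin
2   Select $\mathbf{x}^{(0)}_l = \mathbf{0}$, $\mathbf{V}^{(0)}_l = \mathbf{I}_{N_l}$, and $\boldsymbol{\nu}^{(0)}_k = \mathbf{s}_k$ for $k = 1, \ldots, K$ and $l = 1, \ldots, L$;
3   $t \leftarrow 1$
4   repeat
5     $\boldsymbol{\Omega}^{(t)}_k = \sum_{l\in\mathcal{B}_k}\mathbf{H}_{k,l}\mathbf{V}^{(t-1)}_l\mathbf{H}_{k,l}^H$;
6     update $\boldsymbol{\nu}^{(t)}_k$ as derived in Appendix A;
7     $\mathbf{E}^{(t)}_l = \sum_{k\in\mathcal{U}_l}\mathbf{H}_{k,l}^H(\boldsymbol{\Omega}^{(t)}_k + \beta\mathbf{I}_{M_k})^{-1}\mathbf{H}_{k,l}$;
8     update $\mathbf{F}^{(t)}_l$ as derived in Appendix A;
9     $\mathbf{x}^{(t)}_l = (\mathbf{E}^{(t)}_l + \mathbf{I}_{N_l})^{-1}\mathbf{F}^{(t)}_l$;
10    $\mathbf{V}^{(t)}_l = (\mathbf{E}^{(t)}_l + \mathbf{I}_{N_l})^{-1}$;
11    $t \leftarrow t + 1$
12  until a predefined number of iterations is met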

In AMP-RZFBF, the computation of $\mathbf{x}^{(t)}_l$ and $\mathbf{V}^{(t)}_l$ involves several matrix inversions for every channel
realization. These demand high computational cost and rapid information exchange between the BSs. To
remedy this, we propose to infer these parameters based on the CCoI, which varies much more slowly than the CSI.

C. CCoI-aided AMP-RZFBF 

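Starting from the initial condition, we approximate $\boldsymbol{\Omega}^{(1)}_k$ by its average with respect to different realizations
of the measurement matrix:

$$\boldsymbol{\Omega}^{(1)}_k \approx \mathsf{E}\{\boldsymbol{\Omega}^{(1)}_k\} = \sum_{l\in\mathcal{B}_k}\mathsf{E}\{\mathbf{H}_{k,l}\mathbf{H}_{k,l}^H\} = \sum_{l\in\mathcal{B}_k}\frac{1}{N_l}\mathrm{tr}(\mathbf{T}_{k,l})\,\mathbf{R}_{k,l}, \qquad (19)$$

where the last equality follows from Lemma 2 in Appendix B. The approximation benefits from the
self-averaging property in statistical physics; that is, a quantity per degree of freedom has small deviations
from its mean. In fact, using techniques from random matrix theory, e.g., [39], one can show that as
$N_l \to \infty$, $\boldsymbol{\Omega}^{(1)}_k \to \mathsf{E}\{\boldsymbol{\Omega}^{(1)}_k\}$ almost surely. We find it useful to denote $\phi^{(1)}_{k,l} = \frac{1}{N_l}\mathrm{tr}(\mathbf{T}_{k,l})$ and define

$$\bar{\mathbf{R}}^{(1)}_k = \sum_{l\in\mathcal{B}_k}\phi^{(1)}_{k,l}\,\mathbf{R}_{k,l}. \qquad (20)$$

Applying a similar argument to $\mathbf{E}^{(1)}_l$ in Line 7 of Algorithm 1 for $t = 1$, we have

$$\mathbf{E}^{(1)}_l \approx \sum_{k\in\mathcal{U}_l}\frac{1}{N_l}\mathrm{tr}\big(\mathbf{R}_{k,l}(\bar{\mathbf{R}}^{(1)}_k + \beta\mathbf{I}_{M_k})^{-1}\big)\,\mathbf{T}_{k,l}. \qquad (21)$$

Again, for ease of expression, we also define

$$\psi^{(t)}_{k,l} = \frac{1}{N_l}\mathrm{tr}\big(\mathbf{R}_{k,l}(\bar{\mathbf{R}}^{(t)}_k + \beta\mathbf{I}_{M_k})^{-1}\big), \qquad (22)$$

and

$$\bar{\mathbf{T}}^{(t)}_l = \sum_{k\in\mathcal{U}_l}\psi^{(t)}_{k,l}\,\mathbf{T}_{k,l}. \qquad (23)$$

Substituting the above definitions, (21) is then expressed as

$$\mathbf{E}^{(t)}_l \approx \bar{\mathbf{T}}^{(t)}_l. \qquad (24)$$

Now, $\mathbf{x}^{(t)}_l$ and $\mathbf{V}^{(t)}_l$ can be calculated as in Lines 9-10 of Algorithm 1, but $\mathbf{V}^{(t)}_l = (\mathbf{E}^{(t)}_l + \mathbf{I}_{N_l})^{-1}$ is
approximated by

$$\mathbf{V}^{(t)}_l \approx (\bar{\mathbf{T}}^{(t)}_l + \mathbf{I}_{N_l})^{-1}. \qquad (25)$$

Note that when $t = 1$, $\boldsymbol{\Omega}^{(t)}_k$ is given by (19). Let us proceed to the next round of iteration to get an
expression for $\boldsymbol{\Omega}^{(t)}_k$ for general $t$. Following an argument similar to that used in (19), we have

$$\boldsymbol{\Omega}^{(t)}_k \approx \sum_{l\in\mathcal{B}_k}\frac{1}{N_l}\mathrm{tr}\big(\mathbf{T}_{k,l}(\bar{\mathbf{T}}^{(t-1)}_l + \mathbf{I}_{N_l})^{-1}\big)\,\mathbf{R}_{k,l}. \qquad (26)$$

Define

$$\phi^{(t)}_{k,l} = \frac{1}{N_l}\mathrm{tr}\big(\mathbf{T}_{k,l}(\bar{\mathbf{T}}^{(t-1)}_l + \mathbf{I}_{N_l})^{-1}\big). \qquad (27)$$

Then (26) becomes

$$\boldsymbol{\Omega}^{(t)}_k \approx \bar{\mathbf{R}}^{(t)}_k = \sum_{l\in\mathcal{B}_k}\phi^{(t)}_{k,l}\,\mathbf{R}_{k,l}. \qquad (28)$$

Recall that the updates of $\boldsymbol{\Omega}^{(t)}_k$, $\mathbf{E}^{(t)}_l$, and $\mathbf{V}^{(t)}_l$ in Lines 5, 7, and 10 of Algorithm 1, respectively, involve
the channel realizations $\{\mathbf{H}_{k,l}\}$. These computations are replaced by (28), (24), and (25), where only the
CCoI is required. Therefore, Algorithm 1 together with these replacements leads to simpler iteration
forms. The algorithmic description of this CCoI-aided AMP-RZFBF is summarized in Algorithm 2.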
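The self-averaging argument behind (19) can be illustrated numerically: for a single link with many BS antennas, one realization of $\mathbf{H}_{k,l}\mathbf{H}_{k,l}^H$ is already close to $\frac{1}{N_l}\mathrm{tr}(\mathbf{T}_{k,l})\mathbf{R}_{k,l}$. The sketch below is an assumption-laden example: the exponential correlation profile, the sizes, and the use of Cholesky factors in place of the principal square roots (which leaves the distribution of $\mathbf{H}\mathbf{H}^H$ unchanged) are illustrative choices, not taken from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

def exp_corr(n, rho):
    """Hypothetical correlation matrix with entries rho**|i-j|."""
    i = np.arange(n)
    return rho ** np.abs(i[:, None] - i[None, :])

M_k, N_l = 2, 1000
R, T = exp_corr(M_k, 0.3), exp_corr(N_l, 0.5)

# One channel realization with i.i.d. CN(0, 1/N_l) entries in W.
W = (rng.standard_normal((M_k, N_l)) + 1j * rng.standard_normal((M_k, N_l))) / np.sqrt(2 * N_l)
H = np.linalg.cholesky(R) @ W @ np.linalg.cholesky(T).conj().T

omega_inst = H @ H.conj().T           # instantaneous term, needs the CSI
omega_ccoi = np.trace(T) / N_l * R    # CCoI-based surrogate, cf. (19)
err = np.max(np.abs(omega_inst - omega_ccoi))
```

The deviation `err` shrinks as `N_l` grows, which is what allows the instantaneous quantities to be replaced by their CCoI-based counterparts.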
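Algorithm 2: CCoI-aided AMP-RZFBF

Input: Data symbols $\mathbf{s}_k$ for $k = 1, \ldots, K$, channel matrices $\mathbf{H}_{k,l}$ for $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and CCoI
$\{\mathbf{T}_{k,l}, \mathbf{R}_{k,l}\}$ for $k = 1, \ldots, K$ and $l = 1, \ldots, L$.
Output: Return the RZFBF $\mathbf{x}_l$ for $l = 1, \ldots, L$.
1 begin
2   Select $\mathbf{x}^{(0)}_l = \mathbf{0}$, $\boldsymbol{\nu}^{(0)}_k = \mathbf{s}_k$, $\bar{\mathbf{R}}^{(0)}_k = \mathbf{0}$, and $\bar{\mathbf{T}}^{(0)}_l = \mathbf{0}$ for $k = 1, \ldots, K$ and $l = 1, \ldots, L$;
3   $t \leftarrow 1$
4   repeat
5-10  compute the CCoI-based parameters using (27), (28), (22), (23), and (25);
11-12 update $\boldsymbol{\nu}^{(t)}_k$ and $\mathbf{x}^{(t)}_l$ using linear matrix multiplications only;
13    $t \leftarrow t + 1$
14  until a predefined number of iterations is met;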



The realization of CCoI-aided AMP-RZFBF is similar to that of AMP-RZFBF but with much lower
computational complexity and much less communication overhead. Firstly, notice that lines 5-10 of
Algorithm 2 can be computed offline and locally regardless of the channel realizations $\{\mathbf{H}_{k,l}\}$, the data symbols
$\{\mathbf{s}_k\}$, and the outputs $\{\mathbf{x}^{(t)}_l\}$ of each iteration. Because the CCoI can be considered static, the BSs compute
and exchange these parameters at the time scale at which the CCoI changes rather than that of the
instantaneous channel realizations. This characteristic significantly reduces the computational complexity and
the communication overhead. Secondly, the remaining two steps, lines 11-12 of Algorithm 2, involve only
linear matrix multiplications. The update of $\boldsymbol{\nu}^{(t)}_k$ and $\mathbf{x}^{(t)}_l$ also requires a general broadcast and gathering
operation. In particular, to update $\boldsymbol{\nu}^{(t)}_k$, BS$_l$ must gather $\mathbf{H}_{k,j}\mathbf{x}^{(t-1)}_j$ from the set of BSs $j \in \mathcal{B}_k$, but it only
updates $\boldsymbol{\nu}^{(t)}_k$ for $k \in \mathcal{U}_l$. After $\mathbf{x}^{(t)}_l$ is computed, BS$_l$ will broadcast $\mathbf{H}_{k,l}\mathbf{x}^{(t)}_l$ to its neighboring BSs. The
algorithm continues to repeat the procedures above until it reaches a predefined number of iterations.

IV. Simulation Results 

In this section, we compare the performance of different algorithms through simulations. The considered
algorithms include all the message passing algorithms in Section III (i.e., BP-RZFBF, AMP-RZFBF, and
CCoI-aided AMP-RZFBF) and the ADMM approach in [33, Section 8.3]. ADMM is a state-of-the-art
optimization technique and is now widely used for distributed estimation.

Before proceeding, let us first take a look at the computational complexity of these algorithms. In
BP-RZFBF, most of the computational complexity lies in the matrix inversions in (14)-(15) and (17). Moreover,
we have to perform these matrix inversions for all the $2KL$ messages. This gives a complexity of order
$KL\,O(M_k^3)$ for each iteration. In AMP-RZFBF, the complexity also lies in the matrix inversions, while
the messaging overhead is reduced to $2(K + L)$. Therefore, the complexity of AMP-RZFBF is of order
$(K + L)\,O(M_k^3)$ for each iteration. The complexity of the ADMM approach is comparable to that of AMP-RZFBF.
Finally, the complexity of CCoI-aided AMP-RZFBF is further reduced relative to AMP-RZFBF because the
matrix inversions are performed at the time scale at which the CCoI changes. Therefore, the computational
complexity of CCoI-aided AMP-RZFBF is of order $\left(\frac{K+L}{\tau}\right) O(M_k^3)$ for each iteration, where $\tau$ represents the
time scale at which the CCoI changes. The value of $\tau$ could be very large because the CCoI can be considered
static. Consequently, CCoI-aided AMP-RZFBF is the most efficient of the algorithms to implement.
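To give a feel for the difference in messaging overhead, the counts quoted above can be evaluated for the network size used later in the simulations. This is a trivial sketch; the function name `message_counts` is hypothetical, and the BP count $2KL$ is the worst case in which every BS is coupled to every UE.

```python
def message_counts(K, L):
    """Messages per iteration: 2KL for BP-RZFBF versus 2(K + L) for AMP-RZFBF."""
    return {"BP-RZFBF": 2 * K * L, "AMP-RZFBF": 2 * (K + L)}

counts = message_counts(K=100, L=100)
ratio = counts["BP-RZFBF"] / counts["AMP-RZFBF"]
```

For $K = L = 100$ this gives 20000 messages for BP-RZFBF against 400 for AMP-RZFBF, a factor of 50.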

With the computational complexity in mind, we now turn our attention to performance. We consider
a cellular system with 100 BSs and 100 UEs in which each BS is equipped with 8 transmit antennas
and each user has 4 receive antennas, i.e., $L = 100$, $K = 100$, $N_l = 8$, and $M_k = 4$. The propagation
channel matrix between each BS and UE is characterized by (2), where the spatial correlations $\mathbf{R}_{k,l}$'s and
$\mathbf{T}_{k,l}$'s are arbitrarily generated with elements being $[\mathbf{R}_{k,l}]_{ij} = \rho_r^{|i-j|}$ and $[\mathbf{T}_{k,l}]_{ij} = \rho_t^{|i-j|}$, respectively.
Additionally, the link gain $g_{k,l}$ is included in $\mathbf{R}_{k,l}$ and is also uniformly and randomly generated. Figure
2 illustrates how the average throughput of the algorithms varies with the number of message transfers. The

average throughput is calculated as a normalized sum $\sum_{k=1}^{K}\sum_{m=1}^{M_k}\log_2(1 + \gamma_{m,k})$, where the SINR $\gamma_{m,k}$ has the
denominator $|\mathbf{e}_m^T(\mathbf{H}_k\mathbf{x}^{(t)} - \mathbf{s}_k)|^2 + \sigma^2$ and
$\mathbf{x}^{(t)}$ is the vector of transmitted signals at the $t$-th iteration. Here, $\mathbf{H}_k$ denotes $[\mathbf{H}_{k,1} \cdots \mathbf{H}_{k,L}]$ and $\mathbf{e}_m$ has
been defined in Notations. The results provided are for a particular realization of the channel. As expected,
when the number of iterations increases, the average throughput increases and eventually saturates.
Here, RZFBF in (4) serves as a benchmark for the optimal beamformer. From Figure 2, it can be observed
that the proposed message passing algorithms converge significantly faster than the ADMM approach. The
convergence rates of all the proposed message passing algorithms are very similar.

Recall that AMP-RZFBF follows from BP-RZFBF by using the approximation that $\mathbf{E}_{l\setminus k}$ and $\mathbf{F}_{l\setminus k}$
are nearly independent of $k$. This approximation is expected to be good if $K$ and $L$ are extremely large.
Furthermore, the CCoI-aided AMP-RZFBF uses the large system approximation by assuming $N_l \to \infty$.
Although the setting in Figure 2 corresponds to a practical system dimension, it is intriguing to see the
performance under a relatively small network, e.g., $L = 16$, $K = 16$, $N_l = 4$, and $M_k = 2$. Under the
small network consideration, Figure 3 illustrates the convergence of the algorithms. Characteristics similar
to those in Figure 2 are observed. Additionally, compared with BP-RZFBF, AMP-RZFBF and CCoI-aided
AMP-RZFBF only slightly degrade the convergence rate. This result is quite different from several earlier
designs based on CCoI, e.g., [31,32]. Usually, when some calculations are approximated by the CCoI, an
obvious degradation in performance would be observed, but this is not the case in our scheme.

Figure 2: Average throughput against the number of iterations for the message passing beamformers and
global beamformer when $L = 100$, $K = 100$, $N_l = 8$, $M_k = 4$, and $\beta = \sigma^2 = 10^{\ldots}$.

Figure 3: Average throughput against the number of iterations for the message passing beamformers and
global beamformer when $L = 16$, $K = 16$, $N_l = 4$, $M_k = 2$, and $\beta = \sigma^2 = 10^{\ldots}$.

V. Conclusion 

Using Bayesian inference, this paper proposed several message passing algorithms for realizing RZFBF
in cooperative-BS networks, namely, BP-RZFBF, AMP-RZFBF, and CCoI-aided AMP-RZFBF. Results
showed that the proposed algorithms converge very fast to the exact RZFBF. Compared with BP-RZFBF,
both AMP-RZFBF and CCoI-aided AMP-RZFBF perform well, with only very slight degradation in the
convergence rate, while greatly reducing the burden of information exchange between the BSs.
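Appendix A: Derivation for AMP-RZFBF

To derive AMP-RZFBF, we use a heuristic approximation which keeps all the terms that are linear in the
matrix $\mathbf{H}_{k,l}$ while neglecting the higher-order terms. A similar methodology was used in [29] in the case
of compressed sensing, although some modifications are required to reflect the case concerned.

We start by noticing that $\mathbf{E}^{(t)}_{l\setminus k} = \sum_{i\in\mathcal{U}_l\setminus k}\mathbf{E}^{(t)}_{i\to l}$ is the sum of $K$ terms, each of order $1/N_l$ because $\mathbf{H}_{k,l}$
scales as $O(1/\sqrt{N_l})$. Therefore, it is natural to approximate $\mathbf{E}^{(t)}_{l\setminus k}$ by $\mathbf{E}^{(t)}_l = \sum_{i\in\mathcal{U}_l}\mathbf{E}^{(t)}_{i\to l}$, which only depends on
the index $l$ and not on $k$. Similarly, it is natural to anticipate a similar approximation for $\mathbf{F}^{(t)}_{l\setminus k}$. However,
we must be careful to keep all correction terms of order $1/\sqrt{N_l}$. To that end, we instead set

$$\mathbf{F}^{(t)}_{l\setminus k} = \mathbf{F}^{(t)}_l - \Delta\mathbf{F}^{(t)}_{k\to l}. \qquad (29)$$

Recall from (17) that $\mathbf{x}^{(t)}_{l\to k} = (\mathbf{E}^{(t)}_{l\setminus k} + \mathbf{I}_{N_l})^{-1}\mathbf{F}^{(t)}_{l\setminus k}$. Then we get

$$\mathbf{x}^{(t)}_{l\to k} = (\mathbf{E}^{(t)}_{l\setminus k} + \mathbf{I}_{N_l})^{-1}\mathbf{F}^{(t)}_l - (\mathbf{E}^{(t)}_{l\setminus k} + \mathbf{I}_{N_l})^{-1}\Delta\mathbf{F}^{(t)}_{k\to l} \approx (\mathbf{E}^{(t)}_l + \mathbf{I}_{N_l})^{-1}\mathbf{F}^{(t)}_l - (\mathbf{E}^{(t)}_l + \mathbf{I}_{N_l})^{-1}\Delta\mathbf{F}^{(t)}_{k\to l}. \qquad (30)$$

We will approximate the above two terms by dropping their negligible components. Before proceeding, we
deal with the approximation of $\mathbf{E}^{(t)}_l$. Let us define $\boldsymbol{\Omega}^{(t)}_k = \sum_{l}\mathbf{H}_{k,l}\mathbf{V}^{(t-1)}_{l\to k}\mathbf{H}_{k,l}^H$. Then we have

$$\mathbf{E}^{(t)}_l = \sum_{k}\mathbf{H}_{k,l}^H\big(\boldsymbol{\Omega}^{(t)}_k - \mathbf{H}_{k,l}\mathbf{V}^{(t-1)}_{l\to k}\mathbf{H}_{k,l}^H + \beta\mathbf{I}_{M_k}\big)^{-1}\mathbf{H}_{k,l} \approx \sum_{k}\mathbf{H}_{k,l}^H\big(\boldsymbol{\Omega}^{(t)}_k + \beta\mathbf{I}_{M_k}\big)^{-1}\mathbf{H}_{k,l}, \qquad (31)$$

where the approximation follows from the fact that $\mathbf{H}_{k,l}\mathbf{V}^{(t-1)}_{l\to k}\mathbf{H}_{k,l}^H$ is of order $1/N_l$ and can be safely
neglected. Similarly, we note that $\mathbf{V}^{(t)}_{l\to k}$ is nearly independent of $k$. This leads to

$$\mathbf{V}^{(t)}_{l\to k} \approx \mathbf{V}^{(t)}_l = (\mathbf{E}^{(t)}_l + \mathbf{I}_{N_l})^{-1}. \qquad (32)$$

Then we get

$$\boldsymbol{\Omega}^{(t)}_k = \sum_{l}\mathbf{H}_{k,l}\mathbf{V}^{(t-1)}_{l\to k}\mathbf{H}_{k,l}^H \approx \sum_{l}\mathbf{H}_{k,l}\mathbf{V}^{(t-1)}_l\mathbf{H}_{k,l}^H. \qquad (33)$$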
Now, we return to the approximation of $\mathbf{x}^{(t)}_{l\to k}$. First, we deal with the second term of (30) and get

[...]

where the first approximation is directly from (30) by substituting the definition of $\Delta\mathbf{F}^{(t)}_{k\to l}$, and we have
defined $\boldsymbol{\nu}^{(t)}_k = \sum_{l}\mathbf{H}_{k,l}\mathbf{x}^{(t-1)}_{l\to k}$. Substituting the above approximation of $\mathbf{x}_{l\to k}$ in $\boldsymbol{\nu}^{(t)}_k$, we get

[...] (34)

where the second equality follows from (33).

Now, it remains to complete the calculation of $\mathbf{x}^{(t)}_l$. We start from the definition

[...] (35)

where we have defined

[...] (36)

Following similar approximations as above, we get

[...] (37)

and then

[...] (38)

Putting the above relations (31), (33), (34), (37), and (38) together, we get AMP-RZFBF.



Appendix B: Lemmas 

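For convenience, we provide some mathematical tools needed in this paper.
Lemma 1 Given a positive definite matrix $\mathbf{A}$, we have

$$\frac{1}{Z}\int \mathbf{x}\, e^{-\mathbf{x}^H\mathbf{A}\mathbf{x} + \mathbf{b}^H\mathbf{x} + \mathbf{x}^H\mathbf{b}}\, d\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}, \qquad (39)$$
where $Z$ is a normalization factor such that $\frac{1}{Z}\int e^{-\mathbf{x}^H\mathbf{A}\mathbf{x} + \mathbf{b}^H\mathbf{x} + \mathbf{x}^H\mathbf{b}}\, d\mathbf{x} = 1$.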

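Lemma 2 A random matrix $\mathbf{X} \in \mathbb{C}^{n \times m}$ is said to have a matrix variate complex Gaussian distribution
with mean $\bar{\mathbf{X}}$ and covariance matrix $\mathbf{B} \otimes \mathbf{A}$ if it can be written as $\bar{\mathbf{X}} + \mathbf{A}^{1/2}\mathbf{W}\mathbf{B}^{1/2}$, where $\mathbf{A} \in \mathbb{C}^{n \times n}$ and
$\mathbf{B} \in \mathbb{C}^{m \times m}$ are both positive definite and the elements of $\mathbf{W}$ are i.i.d. complex Gaussian random variables
with zero mean and unit variance. Then we have

$$\mathsf{E}\{\mathbf{X}\mathbf{C}\mathbf{X}^H\} = \bar{\mathbf{X}}\mathbf{C}\bar{\mathbf{X}}^H + \mathrm{tr}(\mathbf{B}\mathbf{C})\,\mathbf{A}, \qquad (40)$$
$$\mathsf{E}\{\mathbf{X}^H\mathbf{D}\mathbf{X}\} = \bar{\mathbf{X}}^H\mathbf{D}\bar{\mathbf{X}} + \mathrm{tr}(\mathbf{A}\mathbf{D})\,\mathbf{B}. \qquad (41)$$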
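Identity (40) is easy to sanity-check by Monte Carlo simulation. The sketch below is purely illustrative: the matrices `A`, `B`, `C`, the mean `X_bar`, and the number of trials are arbitrary assumptions, and the agreement is only up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(0)

def psd_sqrt(A):
    """Principal square root of a Hermitian positive definite matrix."""
    w, U = np.linalg.eigh(A)
    return (U * np.sqrt(w)) @ U.conj().T

n, m, trials = 2, 3, 20000
A = np.array([[1.0, 0.4], [0.4, 1.0]])
B = np.array([[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]])
C = np.diag([1.0, 2.0, 3.0])
X_bar = np.ones((n, m)) * (0.5 + 0.5j)

# Draw all samples at once: W has shape (trials, n, m) with unit-variance entries.
W = (rng.standard_normal((trials, n, m)) + 1j * rng.standard_normal((trials, n, m))) / np.sqrt(2)
X = X_bar + psd_sqrt(A) @ W @ psd_sqrt(B)

empirical = np.mean(X @ C @ np.conj(np.swapaxes(X, 1, 2)), axis=0)
predicted = X_bar @ C @ X_bar.conj().T + np.trace(B @ C) * A
max_dev = np.max(np.abs(empirical - predicted))
```

With a zero mean and $\mathbf{C} = \mathbf{I}$, the same identity gives $\mathsf{E}\{\mathbf{X}\mathbf{X}^H\} = \mathrm{tr}(\mathbf{B})\mathbf{A}$, which is the form used in (19).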